PRINCIPAL COMPONENT ANALYSIS


Principal Component Analysis (PCA) is an unsupervised learning algorithm 
used for dimensionality reduction in machine learning. It is a statistical 
procedure that converts observations of correlated features into a set of 
linearly uncorrelated features by means of an orthogonal transformation. 
These new transformed features are called the principal components. PCA is 
one of the most popular tools for exploratory data analysis and 
predictive modeling. It draws out the strong patterns in a dataset by 
reducing the number of dimensions while keeping as much of the variance as possible. 
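
The claim that an orthogonal transformation turns correlated features into uncorrelated ones can be checked numerically. The following is a minimal sketch, assuming NumPy and synthetic data; the random data and all variable names are illustrative and are not part of the example in these slides.

```python
import numpy as np

# Synthetic data (an assumption for illustration): two correlated features.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 0.8 * x + 0.3 * rng.normal(size=200)
X = np.column_stack([x, y])

# Center each feature, then compute the covariance matrix of the features.
Xc = X - X.mean(axis=0)
cov = np.cov(Xc, rowvar=False)

# The eigenvectors of the symmetric covariance matrix form an orthogonal matrix.
eigvals, eigvecs = np.linalg.eigh(cov)

# Orthogonal transformation: express the data in the eigenvector basis.
Z = Xc @ eigvecs

# Covariance of the transformed features: the off-diagonal entries are ~0,
# i.e. the new features are linearly uncorrelated.
cov_Z = np.cov(Z, rowvar=False)
print(np.round(cov, 3))
print(np.round(cov_Z, 3))
```

The columns of `Z` play the role of the principal components here; the diagonal of `cov_Z` holds the variance along each of them.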


PCA works by considering the variance of each attribute, because an 
attribute with high variance tends to show a good split between the classes. 
Keeping the directions of highest variance is how it reduces the dimensionality. 
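
As a small illustration of "the variance of each attribute", the sketch below computes the sample variance of two attributes. It assumes NumPy and uses the Math and Physics scores from the example in these slides; where the slide text is ambiguous, the value used is an assumption.

```python
import numpy as np

# Scores of six students, as read from the example table in these slides.
# Where the slide text is ambiguous, the value used here is an assumption.
scores = {
    "Math":    [10, 11, 8, 3, 2, 1],
    "Physics": [6, 4, 5, 3, 2.8, 1],
}

# Sample variance (ddof=1) of each attribute across the students.
variances = {name: float(np.var(values, ddof=1)) for name, values in scores.items()}

for name, var in variances.items():
    print(f"{name}: variance = {var:.2f}")
```

Under these assumed values, Math varies more across the students than Physics, so it contributes more to the overall spread of the data.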
EXAMPLE

            Student1  Student2  Student3  Student4  Student5  Student6
Math        10        11        8         3         2         1
Physics     6         4         5         3         2.8       1
Chemistry   12        9         10        2.5       1.3       2
Gene 4      5         7         6         2         4         7

If we measure only 2 variables, Math and Physics:

            Student1  Student2  Student3  Student4  Student5  Student6
Math        10        11        8         3         2         1
Physics     6         4         5         3         2.8       1

...then we can plot the data on 
a 2-dimensional x/y graph. 

[Figure: 2-D plot of the data, with an axis labeled "Gene 2"]

            Student1  Student2  Student3  Student4  Student5  Student6
Math        10        11        8         3         2         1
Physics     6         4         5         3         2.8       1
Chemistry   12        9         10        2.5       1.3       2

If we measured 3 variables, we 
would add another axis to the 
graph and make it look "3-D" (i.e. 
3-dimensional). 

[Figure: 3-D plot of the data, with an axis labeled "Gene 2"]

If we measured 4 variables, 
however, we can no longer 
plot the data: 4 variables require 
4 dimensions. 
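
This is where PCA helps: four variables can be summarized by two principal components, which can be plotted. The following is a minimal sketch, assuming NumPy and the four-variable example table from these slides (ambiguous entries are assumptions); it uses the singular value decomposition, which is one common way to compute PCA and is not necessarily the method intended by the slides.

```python
import numpy as np

# Rows are students, columns are the four variables of the example table
# (values as read from the slides; ambiguous entries are assumptions).
X = np.array([
    [10, 6.0, 12.0, 5],
    [11, 4.0,  9.0, 7],
    [ 8, 5.0, 10.0, 6],
    [ 3, 3.0,  2.5, 2],
    [ 2, 2.8,  1.3, 4],
    [ 1, 1.0,  2.0, 7],
])

# Center each variable on its mean.
Xc = X - X.mean(axis=0)

# SVD of the centered data: the rows of Vt are the principal directions.
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

# Coordinates of each student on the first two principal components.
scores_2d = Xc @ Vt[:2].T

# Fraction of the total variance captured by each component.
explained = S**2 / np.sum(S**2)
print(scores_2d.shape)
print(np.round(explained, 3))
```

Each row of `scores_2d` is one student expressed in two coordinates instead of four, so the data can again be drawn on a 2-dimensional graph.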

We'll also talk about how PCA 
can tell us which variable 
is the most valuable for clustering 
the data. 

[Figure: the Math and Physics scores plotted on a 2-D graph]

Average values (the mean of Math and the mean of Physics) give the 
center of the data. 
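
A minimal sketch of this step, assuming NumPy and the Math and Physics scores from the example table (ambiguous entries are assumptions). It also subtracts the averages from the data, a common follow-up step that is assumed here rather than stated on the slide, so that the center of the data sits at the origin.

```python
import numpy as np

# Math and Physics scores of the six students (values as read from the
# example table; ambiguous entries are assumptions).
math_scores = np.array([10, 11, 8, 3, 2, 1], dtype=float)
physics_scores = np.array([6, 4, 5, 3, 2.8, 1], dtype=float)

# The center of the data is the pair of average values.
center = np.array([math_scores.mean(), physics_scores.mean()])

# Stack the two variables as columns and shift the data so that
# the center lands on the origin (0, 0).
data = np.column_stack([math_scores, physics_scores])
centered = data - center

print("center:", np.round(center, 3))
print("mean after centering:", np.round(centered.mean(axis=0), 12))
```

Shifting the data does not change how the points are positioned relative to each other; it only moves the whole cloud.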

A random line that goes through the origin. 
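
The slide shows only the starting point, a line through the origin chosen at random. One common way to continue (an assumption here, since the remaining steps are left to the whiteboard) is to score a line by the sum of squared distances from the origin to the points projected onto it, and then look for the line with the highest score. The sketch below assumes NumPy and the Math and Physics scores from the example table; the function name `sum_sq_projections` is hypothetical.

```python
import numpy as np

# Math (column 0) and Physics (column 1) scores; values as read from the
# example table, ambiguous entries are assumptions.
data = np.array([
    [10, 6.0], [11, 4.0], [8, 5.0],
    [3, 3.0], [2, 2.8], [1, 1.0],
])
centered = data - data.mean(axis=0)  # center of the data moved to the origin

def sum_sq_projections(points, angle):
    """Sum of squared distances from the origin to the points projected
    onto the line through the origin at the given angle (in radians)."""
    direction = np.array([np.cos(angle), np.sin(angle)])  # unit vector
    projections = points @ direction
    return float(np.sum(projections**2))

# A random line through the origin.
rng = np.random.default_rng(1)
random_angle = rng.uniform(0, np.pi)
random_score = sum_sq_projections(centered, random_angle)

# Rotate the line in small steps and keep the best-scoring angle.
angles = np.linspace(0, np.pi, 1801)
best_angle = max(angles, key=lambda a: sum_sq_projections(centered, a))
best_score = sum_sq_projections(centered, best_angle)

print(f"random line score: {random_score:.2f}")
print(f"best line score:   {best_score:.2f}")
```

The best-scoring line found this way approximates the first principal component; a brute-force search over angles is used only to keep the idea visible, and it works for two variables only.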

MORE STEPS 


- Check the whiteboard 

THANK YOU 